WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6: H04S 1/00    A1

(11) International Publication Number: WO 98/20706

(43) International Publication Date: 14 May 1998 (14.05.98)



(21) International Application Number: PCT/EP97/05902 

(22) International Filing Date: 25 October 1997 (25.10.97) 



(30) Priority Data:
196 46 055.7    7 November 1996 (07.11.96)    DE



(71) Applicant (for all designated States except US):
DEUTSCHE THOMSON-BRANDT GMBH [DE/DE];
Hermann-Schwer-Strasse 3, D-78048 Villingen-Schwenningen (DE).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): BOEHM, Johannes
[DE/DE]; An der Strangriede 12, D-30167 Hannover
(DE). SPILLE, Jens [DE/DE]; Kleines Feld 58, D-30966
Hemmingen (DE).

(74) Agent: HARTNACK, Wolfgang; Deutsche Thomson-Brandt
GmbH, Licensing and Intellectual Property, Göttinger
Chaussee 76, D-30453 Hannover (DE).



(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR,
BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GE,
GH, HU, ID, IL, IS, JP, KE, KG, KP, KR, KZ, LC, LK,
LR, LS, LT, LU, LV, MD, MG, MK, MN, MW, MX, NO,
NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR,
TT, UA, UG, US, UZ, VN, YU, ZW, ARIPO patent (GH,
KE, LS, MW, SD, SZ, UG, ZW), Eurasian patent (AM, AZ,
BY, KG, KZ, MD, RU, TJ, TM), European patent (AT, BE,
CH, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL,
PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN,
ML, MR, NE, SN, TD, TG).



Published 

With international search report. 
Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: METHOD AND DEVICE FOR PROJECTING SOUND SOURCES ONTO LOUDSPEAKERS 




(57) Abstract 

For the purpose of spatial reproduction of an audio signal, the signal must be projected onto the positions of the existing loudspeakers.
It is desirable in this case not to be tied to a specific loudspeaker configuration for transmitting the audio signal. However, a
problem here is that a multiplicity of possible combinations exists. In the method according to the invention, the sound sources (3) are
interpreted as acoustic objects for the purpose of projecting them onto an arbitrary loudspeaker configuration (2). Here, an acoustic object
is a sound source to which, in addition to the audio signal, an item of spatial information is assigned which specifies a virtual spatial
position of the sound source. In order to reproduce an acoustic object, the spatial information of the sound source and the actual position
of a loudspeaker are used to calculate the virtual distance from the sound source via the loudspeaker to the listener (1). Before reproduction,
separate processing (7, 8, 9) of the audio signal for each loudspeaker is then performed for each acoustic object.



WO 98/20706 PCT/EP97/05902

Method and device for projecting sound sources onto loudspeakers

The invention relates to a method and a device for projecting
sound sources onto loudspeakers in order, in particular, to permit
spatial reproduction of the sound sources.

Prior art

It is known from the MPEG-2 Standard ISO/IEC 13818 to aim at a
spatial representation by means of multichannel stereophony, also
called surround sound, for audio reproduction. Six channels are
provided in this case for the multichannel sound, of which three
channels (left, centre, right) are arranged in space in front of the
listener, two channels (left surround, right surround) are arranged
in space behind the listener, and a sixth channel is provided for
reproducing low-frequency tones for special effects. The sound
channels are matrixed in order, on the one hand, to ensure backward
compatibility with MPEG-1 audio signals and, on the other hand, to
permit satisfactory reproduction if only a pair of loudspeakers is
present instead of a complete surround-sound loudspeaker
configuration. In this case, the calculated stereo signals are
transmitted as an MPEG-1-compatible stereo signal and the remaining
signals as additional data.

The invention

It is the object of the invention to specify a method for spatial
reproduction of virtual sound sources. This object is achieved by
means of the method specified in Claim 1.

It is the further object of the invention to specify a device for
applying the method according to the invention. This object is
achieved by means of the device specified in Claim 8.

In order to reproduce an audio signal, the latter frequently has to
be projected onto the positions of the existing loudspeakers. A few
projections may be mentioned here by way of example:

a) The projection of a mono signal onto a pair of stereo
   loudspeakers.
b) The projection of a 3/2 signal (3 loudspeakers in front, 2
   loudspeakers behind) onto a 2/2 loudspeaker arrangement.
c) The projection of a signal with the position 3 m away, 30° left,
   10° high onto a loudspeaker ring which comprises 8 loudspeakers
   at a distance of 2 m with a respective 45° spacing.
d) The projection of 2 sound sources in the room onto 2
   loudspeakers.

It is desirable not to be tied to a specific loudspeaker
configuration for the transmission of an audio signal. However, the
problem arises in this case that there is an unlimited number of
possible combinations.

In principle, the method according to the invention for projecting
sound sources onto loudspeakers consists in interpreting the sound
sources as acoustic objects, an acoustic object being a sound source
to which, in addition to the audio signal, an item of spatial
information is assigned which specifies a virtual spatial position
of the sound source.

The audio signal is advantageously processed as a function of the
associated item of spatial information in order to reproduce an
acoustic object.

In this case, the spatial position of the loudspeakers is preferably
additionally considered, the virtual distance of the sound source
from the loudspeaker being calculated from the spatial information
and the position of the loudspeakers, and separate processing of the
audio signal for each of the loudspeakers being performed for an
acoustic object.

It is, furthermore, advantageous when one or more of the following
parameters are considered when processing the audio signals:

- amplitude attenuation, for example by damping or diffraction,
- a different propagation time for the various acoustic objects and
  loudspeakers,
- consideration of the dependence of the loudspeaker level on the
  spatial arrangement by means of the outer ear function.

In this case, the processing of the audio signals can be further
improved when the frequency dependence of the parameters is also
considered.

The mathematical functions required for considering the parameters
such as, for example, an attenuation function are preferably
transmitted and/or stored as a function of the distance and/or the
angle of deflection.

It is particularly advantageous when the data of 
an acoustic object are stored and/or transmitted by means 
of a compressed data stream in accordance with the MPEG-4 
Standard. 

In principle, the device according to the invention for projecting
sound sources onto loudspeakers consists in providing an arithmetic
unit which calculates the distance of the virtual acoustic objects
from the respective loudspeakers from an item of spatial information
transmitted with the audio signal and the actual position of the
loudspeakers.

In this case, a memory is preferably provided in which the
respective loudspeaker positions and/or mathematical functions for
considering parameters are stored.

It is advantageous to provide n x k actuators for 
n acoustic objects and k loudspeakers, an actuator 
carrying out processing of an audio signal with reference 
to one of the loudspeakers. 
In this case, a frequency dependence of the parameters is preferably
also considered by the actuators, the signals firstly being resolved
into frequency bands by a split filter (10), the individual frequency
bands then being processed individually, and the processed frequency
bands subsequently being recombined by a merge filter (12).

It is particularly advantageous when the split filter and/or the
merge filter are part of an audio decoder which is present in any
case.

Furthermore, one or more directional microphones can preferably be
provided which are used to measure the loudspeaker position.

The directional microphones are preferably integrated in a remote
control.

Drawings

Exemplary embodiments of the invention will be described with the
aid of the drawings, in which:

Figure 1 shows virtual sound sources which are to be projected onto
         an existing pair of loudspeakers;
Figure 2 shows the graphical representation of a model for
         calculating sound paths;
Figure 3 shows the block diagram of a presentation circuit of the
         described model; and
Figure 4 shows a section of an audio decoder according to the
         invention.

Exemplary embodiments

A typical problem is represented in Figure 1. Two virtual sound
sources 3, violin and trumpet, are to be projected onto an existing
pair of loudspeakers 2 such that the listener 1 has the impression
that the violin and trumpet are located in the spatial positions
represented in Figure 1.

A model can be developed for such a projection, and is based on the
following thought experiment: a person is located in a room having a
plurality of windows which are all open. Outside the room there are
various sound sources, also termed acoustic objects below, such as
street musicians, a car horn, etc. The person can locate the various
sound sources effectively in acoustic terms, even if they are not
visible. This is based on the fact that the sound paths through the
various windows are different. The model described below is based on
replacing each window by a loudspeaker. Given that the loudspeakers
are correctly driven, the same sound field should result, and it
should thus also be possible to locate the acoustic objects
identically.

A graphical representation of the model is shown in Figure 2. A
listener 1 is located in an arbitrarily shaped room whose walls 5
consist of absorber material, with the result that no sound can
penetrate from outside and no reflections are produced inside the
room. The sound sources 3 are basically located outside the room.
The loudspeakers or windows are taken into account by holes 6 in the
wall of the room. This produces various sound paths 4 from the sound
source 3 to the listener 1 through the various loudspeakers or window
openings 6. The sound enters the room in this case through all
loudspeakers or window openings, although each sound path has its own
characteristics.
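
The length of one such sound path can be sketched as follows. This is only a minimal illustration: the function name `virtual_path_length`, the Cartesian coordinate tuples and the example positions are assumptions made for this sketch and do not come from the application.

```python
import math


def virtual_path_length(source, loudspeaker, listener):
    """Length of the path source -> loudspeaker (window opening) -> listener."""
    return math.dist(source, loudspeaker) + math.dist(loudspeaker, listener)


# Hypothetical positions in metres (x, y, z), chosen only for illustration.
listener = (0.0, 0.0, 0.0)
loudspeakers = [(-3.0, 4.0, 0.0), (3.0, 4.0, 0.0)]
violin = (-3.0, 8.0, 0.0)

# One path length per loudspeaker for the same acoustic object.
paths = [virtual_path_length(violin, ls, listener) for ls in loudspeakers]
print(paths)
```

Each acoustic object thus yields one path length per loudspeaker, which is the quantity the per-loudspeaker processing described below depends on.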

A presentation circuit in which the model is implemented is
illustrated in the block diagram shown in Figure 3. Two acoustic
objects 3, violin and trumpet, are projected in this case onto the
three existing loudspeakers 2. For each acoustic object the audio
signals are now processed as a function of the virtual spatial
position of this acoustic object and the actual position of each
loudspeaker, in order to permit driving in accordance with the
respective virtual sound path. In a generalization to n acoustic
objects and k loudspeakers, this means that n x k actuators are used.
In this case, one or more of the following parameters 7, 8, 9 are
considered in each of the actuators in accordance with the virtual
sound path. In order to drive the amplitude correctly, the latter
must firstly be calculated as a function of the path length. In
addition, consideration can also be given to attenuation or
absorption by the air. Different functions can be considered in this
case depending on the type of the sound source or the attenuation of
the air. Thus, a spherical sound source loses its acoustic power with
the square of the distance, that is to say the received power is
given by the following formula:

    Received power(r) := transmitted power / r²

By contrast, a cylindrical sound source such as a train or a street,
for example, loses its acoustic power only in inverse proportion to
the distance. The respective functions can be stored in this case in
the presentation circuit, but can likewise be transmitted and stored
with the signal. They can likewise be determined by the respective
application or the user. In addition, it is also possible to consider
diffraction which occurs at the loudspeakers or the window openings.
In order to be able to consider these diffraction effects precisely,
the diffraction would have to be calculated as the sum of all sound
paths through a specific hole geometry, taking the frequency and
phase into consideration. In approximate terms, this means that at
low frequencies propagation takes place in all directions
independently of the angle of incidence, while at higher frequencies
the amplitude of the audio signal is a function of the angle between
the entry to and exit from the respective hole. An approximate
formula can be used to reduce the computational outlay. Such a
formula can also, as already described in the case of attenuation, be
transmitted at the same time or be set by the application or the
user. Since the diffraction effects depend on frequency, it would be
necessary to consider this dependence on frequency in order to be
able to calculate the diffraction attenuation exactly. In order to
realize this in technical terms, it is necessary either to use
filters with defined group delay times, or to resolve the signals
into frequency bands and process them individually.
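
The two distance laws named above (power falling with the square of the distance for a spherical source, and with the distance itself for a cylindrical source) can be sketched as interchangeable attenuation functions. The function names and the `source_type` strings are assumptions of this sketch. The amplitude gain is derived here as the square root of the power ratio, which is an assumption about how an actuator would apply the law to a signal amplitude, not something stated in the text.

```python
def received_power(transmitted_power, r, source_type="spherical"):
    """Received power at path length r (metres) for the two source types."""
    if r <= 0:
        raise ValueError("path length must be positive")
    if source_type == "spherical":
        # Power falls with the square of the distance.
        return transmitted_power / r ** 2
    if source_type == "cylindrical":
        # Power falls only in proportion to the distance.
        return transmitted_power / r
    raise ValueError(f"unknown source type: {source_type}")


def amplitude_gain(r, source_type="spherical"):
    """Amplitude factor, assuming amplitude scales with the square root of power."""
    return received_power(1.0, r, source_type) ** 0.5


print(received_power(8.0, 2.0), received_power(8.0, 2.0, "cylindrical"))
print(amplitude_gain(4.0), amplitude_gain(4.0, "cylindrical"))
```

Keeping the law behind a single function signature reflects the statement that the function may be stored in the presentation circuit, transmitted with the signal, or chosen by the application or the user.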

As represented in Figure 4, in this case the division could be
performed by a split filter 10, subsequent to which processing would
be performed by various actuators 11 and, finally, the processed
signals would be recombined by a merge filter 12. This can be
integrated particularly well into a typical audio decoder for MPEG,
AC-3 or ATRAC signals, since in their case processing is performed in
the frequency domain and a split filter has already been provided for
this purpose, with the result that there is no need to provide an
additional split filter.
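
The split, process and merge chain can be illustrated with a deliberately simple two-band example. This sketch is not the filter bank of an MPEG, AC-3 or ATRAC decoder: it assumes a moving-average low band and its complement as the high band, and a single gain per band standing in for the actuators. All function names and the `width` parameter are hypothetical.

```python
def split_filter(signal, width=4):
    """Split a signal into a low band (moving average) and the complementary high band."""
    low = []
    for i in range(len(signal)):
        window = signal[max(0, i - width + 1): i + 1]
        low.append(sum(window) / len(window))
    high = [x - l for x, l in zip(signal, low)]
    return [low, high]


def process_bands(bands, gains):
    """Stand-in for the actuators: apply one gain per frequency band."""
    return [[g * x for x in band] for band, g in zip(bands, gains)]


def merge_filter(bands):
    """Recombine the processed bands by summing them sample by sample."""
    return [sum(samples) for samples in zip(*bands)]


signal = [0.0, 1.0, 0.0, -1.0, 0.5, 0.25]
bands = split_filter(signal)
# Attenuate only the high band, as a frequency-dependent parameter would.
output = merge_filter(process_bands(bands, [1.0, 0.5]))
print(output)
```

With unit gains in both bands the merged output reproduces the input up to rounding, which is the property a split/merge pair needs so that only the actuators alter the signal.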

A further parameter is the propagation time (delay) of the signal. It
holds here in principle that the sound wave first impinging on the
ear is decisive for the perception of direction. For a path length r
and a mean velocity of sound c of approximately 340 m/s, it holds
that:

    Delay(r) := r/c

In this case, the length r can be shortened by the shortest distance
between the loudspeakers and the listener. This reduces the storage
requirement in the presentation unit.
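
A minimal sketch of the delay calculation follows, including the shortening of r by a common offset. The sample rate of 48000 Hz, the function names and the example path lengths are assumptions for illustration only. The text gives only Delay(r) = r/c with c of approximately 340 m/s.

```python
SPEED_OF_SOUND = 340.0  # m/s, the approximate value given in the description


def delay_seconds(r, c=SPEED_OF_SOUND):
    """Propagation time for a path of length r metres."""
    return r / c


def delays_in_samples(path_lengths, shortest_distance, sample_rate=48000):
    """Delay per path in samples after shortening every path by a common offset.

    Subtracting the same offset from all paths keeps the relative delays
    unchanged while reducing the length of the delay lines to be stored.
    """
    return [
        round(delay_seconds(r - shortest_distance) * sample_rate)
        for r in path_lengths
    ]


# Hypothetical path lengths in metres and a shortest loudspeaker distance of 2 m.
print(delays_in_samples([5.4, 8.8], 2.0))
```

Because only the differences between the delays matter for the perceived direction, removing the common part shortens every delay line by the same amount.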

There is a transfer function, also called the outer ear function,
which is dependent on the direction and frequency, between a sound
source and the human eardrum. In simple terms: the sound from the
front is filtered differently by the pinnae (outer ears) than the
sound from behind.

The outer ear function should be considered if the desire is to
radiate a virtual sound source, positioned at the angle x, by means
of a loudspeaker which is provided at the angle z. This requires the
level difference between the virtual position and the loudspeaker
position to be determined and the signal to be appropriately
filtered. Since the outer ear function is not the same for all
people, it is conceivable to enable the user to choose between
different outer ear functions for the purpose of a particularly good
correction.

Here, as well, the filters can be realised by actuators in the
frequency domain of an audio decoder.

The actual loudspeaker position must be determined in order to
determine the path length between the virtual acoustic object and the
actual loudspeaker position. Various methods are conceivable for
this. Thus, the user could measure the space coordinates of the
respective loudspeaker boxes using a metre rule or similar, and input
the corresponding distance data into an input device which relays
these data to the presentation circuit. The input can be performed
here via a keyboard on the appropriate device, or a remote control,
it also being possible, if appropriate, to monitor the input data or
for the user to be guided by an on-screen display on a display device
or on a viewing screen.

It is also possible to measure the loudspeaker system with the aid of
one or more directional microphones, in order to save the user the
manual measurement of the distances. The distance of the loudspeakers
from the directional microphone or microphones can be determined in
this case by reproducing via the loudspeakers a test sequence with
pulses and by measuring the propagation time. The angles of the
individual loudspeakers can then be determined via the directional
characteristic of the directional microphones. It is then possible to
measure the loudspeaker configuration automatically. In particular,
it is an obvious choice in this case to integrate the microphones in
a remote control.
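
The measurement described above can be sketched as a conversion from a measured propagation time and a measured angle into a loudspeaker position. The sketch assumes that the propagation time is already free of any system latency, that the angle is a horizontal azimuth measured clockwise from straight ahead, and that positions are expressed relative to the microphone. None of these conventions is specified in the text, and the function name is hypothetical.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, approximate value used in the description


def loudspeaker_position(propagation_time, azimuth_deg):
    """Loudspeaker position (x, y) in metres relative to the microphone.

    propagation_time: measured time of flight of the test pulse in seconds.
    azimuth_deg: direction of arrival; 0 is straight ahead, positive is to the
    right (an assumed convention).
    """
    distance = SPEED_OF_SOUND * propagation_time
    azimuth = math.radians(azimuth_deg)
    return (distance * math.sin(azimuth), distance * math.cos(azimuth))


# A pulse arriving after 10 ms from 30 degrees to the right.
print(loudspeaker_position(0.010, 30.0))
```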

The entire virtual path length then results from the position of the
virtual acoustic object and, as described above, the position
determined for the respective loudspeaker. Various possibilities of
representation are conceivable in this case for the two positions.
Thus, this can be performed, for example, by Cartesian coordinates,
that is to say a specification of distance in all three directions in
space, or by spherical coordinates, that is to say a specification of
distance and the specification of the horizontal and, if appropriate,
vertical angle.
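
The two representations can be converted into one another, as the following sketch shows. The axis convention (x to the right, y to the front, z upwards, azimuth positive to the right, elevation positive upwards) and the function names are assumptions of this sketch. The application does not fix a convention.

```python
import math


def spherical_to_cartesian(distance, azimuth_deg, elevation_deg=0.0):
    """Convert distance, horizontal angle and vertical angle to (x, y, z)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = distance * math.cos(el)
    return (
        horizontal * math.sin(az),  # x: to the right
        horizontal * math.cos(az),  # y: to the front
        distance * math.sin(el),    # z: upwards
    )


def cartesian_to_spherical(x, y, z):
    """Convert (x, y, z) back to distance, azimuth and elevation in degrees."""
    distance = math.sqrt(x * x + y * y + z * z)
    azimuth = math.degrees(math.atan2(x, y))
    elevation = math.degrees(math.asin(z / distance)) if distance else 0.0
    return (distance, azimuth, elevation)


# A position 3 m away, 30 degrees to the left and 10 degrees high,
# written with a negative azimuth for "left" under the assumed convention.
print(spherical_to_cartesian(3.0, -30.0, 10.0))
```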

While the position of the loudspeaker should remain unchanged in most
cases, the virtual position of the acoustic objects may well change
frequently. This will be the case, in particular, whenever the audio
signals are reproduced together with video signals. Thus, for
example, in a feature film an actor or a vehicle can move on the
viewing screen or disappear from the screen and thus change spatial
position. It is likewise conceivable that in computer games having
sound outputs a game character is moved by the player, for example
with the aid of a joystick, and that the reproduction of a sound
signal, which is assigned to the game character, is adapted in
accordance with the position prescribed or altered by the player.

The invention can be used to transmit, but also to record and
reproduce digital audio signals, for example in accordance with the
MPEG-4, MPEG-2 or AC-3 Standards. This can be both pure audio signal
reproduction, for example by a CD player, DAB or ADR receivers, and
reproduction of the audio signals in conjunction with video signals,
for example by a DVD player or a digital television receiver.
Furthermore, application is also conceivable in the case of
interactive systems such as videophones or computer games.

Patent Claims

1. Method for projecting sound sources (3) onto loudspeakers (2),
characterized in that the sound sources (3) are interpreted as
acoustic objects, an acoustic object consisting in that in addition
to the audio signal a sound source is assigned an item of spatial
information which specifies a virtual, spatial position of the sound
source.

2. Method according to Claim 1, characterized in that the audio
signal is processed as a function of the associated item of spatial
information in order to reproduce an acoustic object.

3. Method according to Claim 2, characterized in that the spatial
position of the loudspeakers (2) is additionally considered, the
virtual distance of the sound source from the loudspeaker being
calculated from the spatial information and the position of the
loudspeakers, and separate processing of the audio signal for each of
the loudspeakers being performed for an acoustic object.

4. Method according to Claim 2 or 3, characterized in that one or
more of the following parameters are considered when processing the
audio signals:

- amplitude attenuation, for example by damping or diffraction (7),
- a different propagation time for the various acoustic objects and
  loudspeakers (8),
- consideration of the dependence of the loudspeaker level on the
  spatial arrangement by means of the outer ear function (9).

5. Method according to Claim 4, characterized in 
that the frequency dependence of the parameters is also 
considered in processing the audio signals. 

6. Method according to Claim 5, characterized in that mathematical
functions required for considering the parameters such as, for
example, an attenuation function are transmitted and/or stored as a
function of the distance and/or the angle of deflection.

7. Method according to one of the preceding claims, characterized in
that the data of an acoustic object are stored and/or transmitted by
means of a compressed data stream in accordance with the MPEG-4
Standard.

8. Device for projecting sound sources onto loudspeakers,
characterized in that the sound sources are interpreted as acoustic
objects, n x k actuators (7, 8, 9) being provided for n acoustic
objects and k loudspeakers, and an actuator carrying out processing
of an acoustic object with reference to one of the loudspeakers.

9. Device according to Claim 8, characterized in that an actuator
contains at least one of the following units:

- a unit (7) for amplitude matching,
- a time-delay unit (8) for correcting the different propagation
  times,
- a unit (9) for considering the outer ear function.

10. Device according to Claim 9, characterized in that a frequency
dependence of the parameters is also considered by the actuators, the
signals firstly being resolved into frequency bands by a split filter
(10), the individual frequency bands then being processed
individually, and the processed frequency bands subsequently being
recombined by a merge filter (12).

11. Device according to Claim 10, characterized in that the split
filter and/or the merge filter are part of an audio decoder which is
present in any case.

12. Device according to one of Claims 8 to 11, characterized in that
an arithmetic unit is provided which calculates the distance of the
virtual acoustic objects from the respective loudspeakers from an
item of spatial information transmitted with the audio signal and the
actual position of the loudspeakers.

13. Method according to one of Claims 8 to 12, characterized in that
a memory is provided in which the respective loudspeaker positions
and/or mathematical functions for considering parameters are stored.

14. Device according to one of Claims 8 to 13, 
characterized in that one or more directional microphones 
are provided which are used to measure the loudspeaker 
position. 

15. Device according to Claim 14, characterized in
that the directional microphone or the directional
microphones is/are integrated in a remote control.